README.TXT

The files listed below replicate the simulation results and the empirical application presented in
 
"Testing Monotonicity of Mean Potential Outcomes in a Continuous Treatment with High-Dimensional Data."

The simulations and the empirical application were run in MATLAB R2022b. The files are described below.

(1) jobcorpY1.csv, jobcorpY2.csv, jobcorpY3.csv, jobcorpY12.csv
These four datasets (CSV format) share the same treatment (total hours spent in academic and vocational training in the 12 months after program assignment) and the same control variables measured at program assignment. They differ only in the outcome variable, which is one of "earny4" (weekly earnings in the fourth year), "work208" (dummy for employment four years after assignment), "hrswq16" (hours worked per week in quarter 16), or "earnq16" (earnings per week in quarter 16).
The following variables are included in each file: 
y                       outcome
d                       treatment (total hours spent in academic and vocational training in the 12 months after program assignment)
female                  dummy variable for female
age                     age in years
race_white              dummy for white
race_black              dummy for black
race_hispanic           dummy for Hispanic
hgrd                    education in years
hgrdmissdum             dummy for missing information about education
educ_geddiploma         dummy for GED diploma
educ_hsdiploma          dummy for high school diploma
ntv_engl                dummy for native English
marstat_divorced        dummy for divorced
marstat_separated       dummy for separated 
marstat_livetogunm      dummy for cohabiting 
marstat_married         dummy for married
haschldY0               dummy for having children
everwkd                 dummy for ever worked prior to program assignment
mwearn                  average weekly gross earnings in USD
hohhd0                  dummy for being household head
peopleathome            household size (number of members)
peopleathomemissdum     dummy for missing information about household size 
nonres                  dummy for being designated for nonresidential slot
g10                     total household gross income (measured in categories)
g10missdum              dummy for missing information about total household gross income
g12                     total personal gross income (measured in categories)
g12missdum              dummy for missing information about total personal gross income
hgrd_mum                mum’s years of education
hgrd_mummissdum         dummy for missing information about mum’s years of education
hgrd_dad                dad’s years of education
hgrd_dadmissdum         dummy for missing information about dad’s years of education
work_dad_didnotwork     dummy for dad not working when respondent was 14
g2                      dummy for received AFDC every month
g5                      dummy for received public assistance every month
g7                      dummy for received food stamps
welfare_child           welfare receipt during childhood (measured in categories)
welfare_childmissdum    dummy for missing information about welfare receipt during childhood
h1_fair_poor            dummy for poor/fair general health status
h2                      dummy for physical/emotional problems
h10                     extent of marijuana consumption (measured in categories)
h10missdum              dummy for missing information about marijuana consumption
h25                     extent of hallucinogen use (measured in categories)
h25missdum              dummy for missing information about hallucinogen use
h29                     dummy for ever used other illegal drugs
h5                      extent of smoking (measured in categories)
h5missdum               dummy for missing information about smoking
h7                      extent of alcohol consumption (measured in categories)
h7missdum               dummy for missing information about alcohol consumption
i1                      dummy for ever arrested
i10                     number of times in prison
e12                     time spent by recruiter speaking of Job Corps (measured in categories)
e12missdum              dummy for missing information about time spent by recruiter speaking of Job Corps
e16                     extent of recruiter support (measured in categories)
e16missdum              dummy for missing information about extent of recruiter support
e21                     dummy for having an idea about the desired training
e24usd                  expected hourly wage after Job Corps in USD
e24usdmissdum           dummy for missing information about expected hourly wage after Job Corps
e30                     expected improvement in maths (measured in categories)
e30missdum              dummy for missing information about expected improvement in maths
e31                     expected improvement in reading skills (measured in categories)
e32                     expected improvement in social skills (measured in categories)
e35                     expected to be training for a job (measured in categories)
e35missdum              dummy for missing information about expectations to be training for a job 
e37                     dummy for being worried about Job Corps
e6_byphone              dummy for 1st contact with recruiter by phone
e8_recruitersoffice     dummy for 1st contact with recruiter in office
e9ef                    expected stay in Job Corps in months
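
As a rough illustration of how one of these files might be read into MATLAB, the sketch below loads jobcorpY1.csv and splits it into outcome, treatment, and controls. It assumes that the file sits in the current folder, that its header row uses the variable names listed above, and that all columns are numeric. The names T, y, d, X, and controlNames are illustrative only, and job_main.m may load the data differently.

```matlab
% Sketch only: assumes jobcorpY1.csv is in the current folder and has a
% header row with the variable names listed in this README.
dataFile = 'jobcorpY1.csv';

if isfile(dataFile)
    T = readtable(dataFile);              % one row per individual

    y = T.y;                              % outcome
    d = T.d;                              % treatment: hours in training
    % Every remaining column is treated as a control variable.
    controlNames = setdiff(T.Properties.VariableNames, {'y', 'd'}, 'stable');
    X = T{:, controlNames};               % assumes all control columns are numeric

    fprintf('n = %d observations, %d controls\n', numel(y), size(X, 2));
else
    warning('File %s not found; set dataFile to its location.', dataFile);
end
```

Changing dataFile to one of the other three file names should give the same d and X with a different y, since the files differ only in the outcome.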

(2) func_LogitCross.m
This is a MATLAB function for implementing logistic distributional Lasso regression by K-fold cross-fitting.
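
The idea behind K-fold cross-fitting can be sketched as follows: split the sample into K folds, fit a model on K-1 of them, and predict only on the held-out fold, so that every prediction comes from a model that did not see that observation. This is a generic illustration on simulated data, not the code in func_LogitCross.m. Ordinary least squares stands in for the logistic distributional Lasso, and all variable names are hypothetical.

```matlab
% Generic K-fold cross-fitting sketch on simulated data.
% OLS is a stand-in learner; it is NOT the Lasso used in func_LogitCross.m.
rng(1);                                   % reproducible fold split
n = 200;  K = 5;                          % illustrative sample size and fold count
X = randn(n, 3);                          % toy covariates
y = X * [1; -0.5; 0.25] + randn(n, 1);    % toy outcome

fold = mod(randperm(n)', K) + 1;          % random fold labels in 1..K
yhat = nan(n, 1);                         % out-of-fold predictions

for k = 1:K
    test  = (fold == k);                  % held-out fold
    train = ~test;                        % remaining K-1 folds
    b = [ones(sum(train), 1), X(train, :)] \ y(train);   % fit on training folds
    yhat(test) = [ones(sum(test), 1), X(test, :)] * b;   % predict held-out fold only
end

fprintf('Out-of-fold mean squared error: %.3f\n', mean((y - yhat).^2));
```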

(3) func_SUZ.m 
This is a MATLAB function for calculating the confidence intervals of the SUZ method described in Appendix B.

(4) func_MSE.m, func_Logit.m, func_LocalReg.m
These are MATLAB functions used in func_SUZ.m. The file func_MSE.m implements leave-one-out cross-validation for optimal bandwidth selection. The file func_Logit.m implements logistic distributional Lasso regression. The file func_LocalReg.m implements penalized local least squares estimation.
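
For readers unfamiliar with leave-one-out cross-validation for bandwidth choice, a minimal sketch is given below. It is not the code in func_MSE.m: it uses a Nadaraya-Watson estimator with a Gaussian kernel on simulated data as a simple stand-in, and the bandwidth grid and all variable names are assumptions made for illustration.

```matlab
% Leave-one-out CV for a kernel bandwidth (illustrative stand-in only).
rng(2);
n = 150;
d = sort(rand(n, 1));                     % toy continuous regressor
y = sin(2 * pi * d) + 0.3 * randn(n, 1);  % toy outcome

hGrid = linspace(0.02, 0.3, 15);          % candidate bandwidths (assumed grid)
cv    = zeros(size(hGrid));               % CV criterion per bandwidth

for j = 1:numel(hGrid)
    h = hGrid(j);
    W = exp(-0.5 * ((d - d') / h).^2);    % n-by-n Gaussian kernel weights
    W(1:n+1:end) = 0;                     % zero diagonal: observation i is left out
    yhat  = (W * y) ./ sum(W, 2);         % leave-one-out fitted values
    cv(j) = mean((y - yhat).^2);          % mean squared prediction error
end

[~, jBest] = min(cv);
hOpt = hGrid(jBest);                      % bandwidth minimizing the CV criterion
fprintf('Selected bandwidth: %.3f\n', hOpt);
```

The expression d - d' relies on implicit expansion, which is available in MATLAB R2016b and later.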

(5) sim_main.m
This is a MATLAB program for implementing the proposed test and the SUZ method in the simulations. To obtain the results of Tables 1 and 2, modify the parameter values of beta and K appropriately. To obtain the results of Table C1, set K=5 and choose q1 according to the sample size: q1=3, 4, 5, or 6 for n=200; q1=6, 8, 10, or 12 for n=400; q1=12, 16, 20, or 24 for n=800; and q1=18, 24, 30, or 36 for n=1200. To obtain the results of Table C2, set an=sqrt(0.3*log(n)) and bn=sqrt(0.4*(log(n)/log(log(n)))).
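
The settings for Tables C1 and C2 can be tabulated with a short script such as the one below. The q1 values and the formulas for an and bn are copied from the description above. The names nGrid, q1Grid, anFun, and bnFun are illustrative, and how sim_main.m stores these parameters internally is not documented here, so treat this as a lookup aid rather than a drop-in replacement.

```matlab
% Table C1 settings quoted in this README: K = 5 and four q1 values per n.
K      = 5;
nGrid  = [200 400 800 1200];
q1Grid = [ 3  4  5  6;      % n = 200
           6  8 10 12;      % n = 400
          12 16 20 24;      % n = 800
          18 24 30 36];     % n = 1200

% Table C2 settings: an and bn as functions of n (formulas quoted above).
anFun = @(n) sqrt(0.3 * log(n));
bnFun = @(n) sqrt(0.4 * (log(n) ./ log(log(n))));

for i = 1:numel(nGrid)
    n = nGrid(i);
    fprintf('n = %4d | q1 in {%s} | an = %.4f | bn = %.4f\n', ...
        n, num2str(q1Grid(i, :)), anFun(n), bnFun(n));
end
```

Each row of q1Grid equals (n/200) times the n=200 row, which may help when checking a setting by hand.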

(6) Table1.m, Table2.m, TableC1.m, TableC2.m
These are MATLAB programs for generating Tables 1-2 and Tables C1-C2 in the simulations.

(7) job_main.m
This is a MATLAB program for loading the data files and implementing the proposed test in the empirical application.

(8) Table4.m
This is a MATLAB program for generating Table 4 in the empirical application.

(9) mat2lat.m
This is a MATLAB function for translating a MATLAB matrix into a LaTeX table. 
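
The kind of conversion involved can be sketched in a few lines. The example below is independent of mat2lat.m, whose arguments and options are not documented in this README. It formats a toy matrix with placeholder values (not results from the paper) as a LaTeX tabular, and the three-decimal format and right-aligned columns are assumptions.

```matlab
% Independent sketch of a matrix-to-LaTeX conversion (not mat2lat.m itself).
M = [1.5 2.25; 3 4.125];                  % toy matrix with placeholder values

nRow = size(M, 1);
nCol = size(M, 2);
rows = cell(1, nRow);
for i = 1:nRow
    % Format each entry to 3 decimals, join with &, end the row with \\
    cells   = arrayfun(@(v) sprintf('%.3f', v), M(i, :), 'UniformOutput', false);
    rows{i} = [strjoin(cells, ' & '), ' \\'];
end

header = ['\begin{tabular}{', repmat('r', 1, nCol), '}'];
latex  = strjoin([{header}, rows, {'\end{tabular}'}], newline);
disp(latex)
```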





